ON THE SATISFIABILITY OF RANDOM REGULAR SIGNED SAT FORMULAS



CHRISTIAN LAUS AND DIRK OLIVER THEIS 

ABSTRACT. Regular signed SAT is a variant of the well-known satisfiability problem in
which the variables can take values in a fixed set V ⊆ [0, 1], and the literals have the form
"x ≤ a" or "x ≥ a" instead of "x" or "¬x".

We answer some open questions regarding random regular signed k-SAT formulas: the
probability that a random formula is satisfiable increases with |V|; there is a constant upper
bound on the ratio m/n of clauses m over variables n, beyond which a random formula is
asymptotically almost never satisfiable; for k = 2 and V = [0, 1], there is a phase transition
at m/n = 2.

1. Introduction

Let V be a set with at least two elements, 𝒮 a set of subsets of V called signs, and k a
positive integer. For the signed k-satisfiability problem, or signed k-SAT, one is given as
input a set of n variables X and a formula in signed conjunctive normal form. This means
that there is a list of m clauses, each of which is a disjunction (∨) of k (signed) literals of
the form "x ∈ S" where x is a variable in X and the "sign" S is a set in 𝒮. The question
is then whether there exists a satisfying interpretation, i.e., an assignment of values to the
variables such that each of the clauses is satisfied.

Historically, signed SAT originated in the area of so-called multi-valued logic, where 
variables can take a certain number of truth values, not just 0 or 1. This is why the set V
is called the truth-value set. We refer the reader to the survey paper [4], and the references 
therein. 

In the signed k-SAT area, the variants where the literals are inequalities have received
special attention (see also [12, 3]). One speaks of regular signed k-SAT or just regular
k-SAT (k-rSAT for short), if V is a (linearly) ordered set, and the allowed signs are
{x | x ≤ a} and {x | x ≥ a}, a ∈ V. The literals then have the form "x ≤ a" or "x ≥ a".
Clearly, the satisfiability of the formula depends only on v := |V| ∈ {2, 3, …, ∞} rather
than on the set V itself, so we always assume that V ⊆ ℝ, min V = 0, and max V = 1.

This setting includes as a special case the classical satisfiability (SAT) problem: choose
for V the 2-element set {0, 1}, and use the signed literals x ≥ 1 and x ≤ 0 to represent the
classical SAT literals x and ¬x, respectively.

This paper is about k-rSAT formulas drawn at random from all such formulas with a fixed
truth-value set V. We allow that either v < ∞, or, in the limit, V = [0, 1]. These random
formulas are studied for m = cn, for a fixed constant c, in the limit n → ∞. Literals
x ≤ 1 and x ≥ 0 are innocuous: they only affect the number of clauses which remain
to be satisfied. We will draw formulas uniformly at random from the set of all signed
k-SAT formulas with n variables, m clauses and truth-value set V, which do not contain any
innocuous literal.¹

Based on computational investigations of the satisfiability of uniformly generated random
3-rSAT instances, Manya et al. [14] have made a number of observations and conjectures.
Most importantly, they observed a phase transition phenomenon similar to the one in
classical SAT (see, e.g., [10, 1] and the references therein). They interpret their results as
supporting the existence of a threshold c = c_k(v), for k = 3, with the following properties:

(i) the most computationally difficult instances tend to be found when the ratio m/n is
close to c_k(v);

(ii) there is a sharp transition from satisfiable to unsatisfiable instances when the ratio
m/n crosses the threshold;

(iii) c_k(v) is nondecreasing in the number of truth-values v.

Their results are confirmed and extended by other papers exploring uniformly random
3-rSAT instances [5, 4, 6]. From their computational data, Bejar et al. [5, 6] surmise that

(iv) the threshold c_k(v) increases logarithmically with v,

and prove that for 

c > log 8/7 (f) (1) 

a random 3-rSAT formula with m/n = c is asymptotically almost never (a.a.n., as n → ∞)
satisfiable. Their proof (and bound) resembles that for classical k-SAT (2^k log 2).

Our contributions. In this paper, we prove (ii) for k = 2 and V = [0, 1]; establish (iii) for 
all k; improve the bound (1) for large v; and falsify (iv). 

To elaborate, for (iii), we show that the probability that a random formula is satisfiable
increases with v. In particular, the probability that a random formula on a finite
truth-value set is satisfiable is bounded from above by the probability that a random formula
with V = [0, 1] is satisfiable. Thus, if c_3(v) increased logarithmically with v, then for any
finite c, a random formula with truth-value set [0, 1], n variables and m = cn clauses would
be asymptotically almost surely (a.a.s.) satisfiable.

We then prove the following. 

Theorem 1. If c ≥ 1 is such that

\[ k c \,(1 - 2^{-k})^{c-1} < 1, \]

then a random k-rSAT formula with n variables, m = cn clauses, and V = [0, 1] is a.a.n.
satisfiable.

¹ If innocuous literals are allowed in a random formula, the number of clauses which contain an innocuous
literal is distributed like a binomial variable with parameters m = O(n) and 1/v = O(1), and as such is with
high probability O(√n), so that the ratio c = m/n is unaffected for n → ∞. Hence, forbidding innocuous
literals does not change asymptotic results.

This improves on Bejar et al.'s [6] bound (1) for large values of v. Most notably, it gives 
a finite upper bound for all v and thus disproves (iv). In particular, Theorem 1 implies the 
following. 

Corollary. For all V, a random 3-rSAT formula with n variables and m = cn clauses is 
a.a.n. satisfiable, if c > 36.1. □ 
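
The constant in the Corollary can be reproduced numerically. The following is a minimal sketch, assuming the condition of Theorem 1 in the form k c (1 − 2^{-k})^{c-1} < 1; the function names `first_moment_base` and `density_bound` are illustrative and not taken from the paper. It locates by bisection the point beyond which the condition holds.

```python
import math


def first_moment_base(k: int, c: float) -> float:
    """Return k * c * (1 - 2^-k)^(c - 1), the quantity that Theorem 1 requires to be < 1."""
    return k * c * (1.0 - 2.0 ** (-k)) ** (c - 1.0)


def density_bound(k: int, tol: float = 1e-9) -> float:
    """Approximate the smallest c beyond which first_moment_base(k, c) stays below 1."""
    q = 1.0 - 2.0 ** (-k)
    # The base is unimodal in c with its maximum at c = -1 / ln(q);
    # to the right of that point it is strictly decreasing.
    lo = max(1.0, -1.0 / math.log(q))
    hi = 2.0 * lo
    while first_moment_base(k, hi) >= 1.0:  # grow the bracket until the base is < 1
        hi *= 2.0
    while hi - lo > tol:  # plain bisection on the decreasing branch
        mid = 0.5 * (lo + hi)
        if first_moment_base(k, mid) >= 1.0:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    print(density_bound(3))  # lies just below the 36.1 quoted in the Corollary
```

Bisection is used here only because the function is monotone on the relevant branch; any root finder would do.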

We then move on to study 2-rSAT. Here, Theorem 1 gives an upper bound of approximately 12.07
beyond which a random 2-rSAT formula is a.a.n. satisfiable. To prove a lower bound beneath which
such a formula is satisfiable, we use a result by Chepoi et al. [7], who prove a characterization
of non-satisfiable signed 2-SAT instances based on a digraph certificate, in the spirit
of Aspvall, Plass, and Tarjan's famous result for classical 2-SAT [2]. Using Chepoi et al.'s
characterization we prove the following.

Theorem 2. A random 2-rSAT formula with n variables, m = cn clauses, and |V| = ∞ is

(a) a.a.s. satisfiable, if c < 2, and

(b) a.a.n. satisfiable, if c > 2.

The improved upper bound here comes from a concentration result. This theorem shows
that, as for classical random k-SAT, k-rSAT exhibits a threshold behaviour if k = 2.

The main difficulty in the last theorem for 2-rSAT, as compared to classical 2-SAT, comes
from the fact that there is an infinite number of possible literals, as opposed to the
total of 2n possible literals for classical SAT. In our proof, we make use of the following
trick. When conditioning on the number of times R_j each variable x_j occurs, the structure
one needs to analyze has some resemblance to the configuration model for random (multi)
graphs with prescribed degrees R_j. This way, Chvatal and Reed's [8] approach for classical
2-SAT can be adapted.

Organization of the paper. The remainder of the paper is organized as follows. In the next 
section, we discuss the random model, and variants of it, in the necessary details and prove 
the monotonicity of the probability of satisfiability mentioned above. Section 3 contains the 
proof of Theorem 1. Sections 4 and 5 hold the proof of Theorem 2. In the final section, we 
discuss a few open questions. 

2. Basics about random k-rSAT

In this section, we discuss variants of the random model which we need. Then we will
prove some basic facts about k-rSAT.

We will think of a random formula as being constructed as follows. First of all, we
assume that the truth-value set V is a subset of the unit interval [0, 1] which is symmetric
(i.e., V = 1 − V) and which contains both 0 and 1.

Now we take an "empty" formula, i.e., we have m × k empty slots. Each slot will be
filled by a triple (x, ρ, a) where x is one of the variables, ρ is a comparison relation "≤"
or "≥", and a is in V \ {1}. The interpretation of such a triple is that, if ρ = "≤", then
we have the condition x ≤ a, whereas, if ρ = "≥", we have the condition x ≥ 1 − a.
By this construction, we exclude the cases of the inequalities x ≥ 0 and x ≤ 1, which are
meaningless because they do not constrain x. The part (ρ, a) is referred to as the constraint
part of the literal.

For each slot, the three parts of the triple are chosen independently from each other. The
selection of the right hand sides and the comparison relations is done independently for
all slots. For the variables, there are several possibilities. First of all, they may
either be chosen independently for all slots (i.e., allowing a clause to contain more than one
slot with the same variable), or independently for all slots but conditioning on the k variables
occurring in a clause being distinct. In the second case, the probability of the event on which
we condition is asymptotically bounded away from 0. Hence, as far as a.a.s. statements about
random formulas are concerned, the two possibilities for the random selection of the variables
are equivalent. We denote by F_k(n, m, v) a random k-rSAT formula with truth-value set of
cardinality v in which, for each clause, the variables in the slots are distinct; by F'_k(n, m, v),
we denote a random formula where the variables can occur multiple times in the same
clause.
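
The slot-filling construction above can be sketched in a few lines. This is an illustration only, under the assumption that the finite truth-value set is the equally spaced set V = {u/(v − 1) | u = 0, …, v − 1}; the names `sample_formula`, `literal_holds` and `satisfies` are hypothetical, and literals are stored as triples (j, rel, a) with j the index of the variable.

```python
import random
from fractions import Fraction


def sample_formula(n, m, k, v, rng, distinct=True):
    """Sample m clauses of k triples (j, rel, a) with a drawn uniformly from V minus {1}."""
    values = [Fraction(u, v - 1) for u in range(v - 1)]  # assumed equally spaced V, without 1
    formula = []
    for _ in range(m):
        if distinct:
            variables = rng.sample(range(n), k)  # k distinct variables per clause
        else:
            variables = [rng.randrange(n) for _ in range(k)]  # repetitions allowed
        clause = [(j, rng.choice(("<=", ">=")), rng.choice(values)) for j in variables]
        formula.append(clause)
    return formula


def literal_holds(literal, x):
    """A triple (j, '<=', a) means x[j] <= a; a triple (j, '>=', a) means x[j] >= 1 - a."""
    j, rel, a = literal
    return x[j] <= a if rel == "<=" else x[j] >= 1 - a


def satisfies(formula, x):
    """An interpretation satisfies the formula if every clause has a satisfied literal."""
    return all(any(literal_holds(lit, x) for lit in clause) for clause in formula)
```

Passing `distinct=False` gives the variant in which a variable may occur several times in the same clause.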

Secondly, we may choose the variables conditioning on the number of times each variable
occurs in the formula. For a random formula F, let the random variable R_j denote the
number of slots containing variable x_j. Clearly, we have

\[ \sum_{j=1}^{n} R_j = km. \tag{2} \]

If, when choosing the variables, we allow a clause to contain more than one slot with the
same variable, then R := (R_j)_{j=1,…,n} has multinomial distribution, i.e., for all r ∈ ℕ^n
with Σ_j r_j = km we have

\[ P[R = r] = \binom{km}{r_1, \dots, r_n}\, n^{-km}. \tag{3} \]

This is the same as the R_j being independent Poisson with mean km/n, conditioned on (2).

When constructing a random formula, we may reverse this view: We may condition on
the values of R. This amounts to pretending that, for j = 1, …, n, there are R_j
distinguishable copies of variable x_j, and the km variable copies are assigned to the km slots
randomly.

Monotonicity. We now come to some basic facts about random k-rSAT formulas. We
start with the monotonicity property of rSAT formulas mentioned in the introduction. Denote by

\[ p_k(n, m, v) := P[\,F_k(n, m, v) \text{ is satisfiable}\,] \tag{4} \]

the probability that a random k-rSAT formula with m clauses on n variables and
truth-value set of cardinality v = |V| is satisfiable. We will habitually omit the k. Naively
speaking, increasing |V| increases the possible choices for the variables, so we would guess that
p(n, m, v) increases with v. That is in fact the case.

The easiest setting in which we can visualize this phenomenon is when the
right hand sides are of the form

\[ A = \sum_{i=1}^{\lambda} B_i 2^{-i}, \]

where the B_i, i = 1, 2, 3, …, are independent Bernoulli random variables with P[B_i =
1] = 1/2, and λ is either finite, in which case |V| = 2^λ + 1, or λ = ∞, in which
case V = [0, 1]. Now note that increasing λ increases A. But this weakens the inequalities
constraining the variables, and thus makes the formula "more satisfiable". An only slightly
more technical argument proves this monotonicity fact for general |V|.

Lemma 3. For every k, n, m, the following hold.

(a) For every v, we have p_k(n, m, v) ≤ p_k(n, m, v + 1).

(b) For every v, we have p_k(n, m, v) ≤ p_k(n, m, ∞).

(c) We have lim_{v→∞} p_k(n, m, v) = p_k(n, m, ∞).

Proof. (a). Suppose we have V = {u/(v − 1) | u = 0, …, v − 1}. For a random constraint,
a value in V \ {1} is drawn uniformly at random (uar). We would like to increase v to v + 1.
Firstly, we scale all values by the factor (v − 1)/v. This does not influence satisfiability.
Secondly, we increase u/v to (u + 1)/v with probability (u + 1)/v, u = 0, …, v − 2. This
yields the uniform distribution on V \ {1} = {u/v | u = 0, …, v − 1}.

Note that increasing a in a literal (x, ρ, a) will never make a satisfiable formula
unsatisfiable: indeed, every satisfying interpretation remains satisfying. However, if a formula is
not satisfiable, whenever the value for a in a literal is increased, there is a possibility that
the formula becomes satisfiable. Thus, the probability that a random formula is satisfiable
does not decrease when passing from v to v + 1.
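
That the two-step procedure in part (a) indeed produces the uniform distribution can be checked with exact rational arithmetic. The sketch below is an illustration, not part of the original argument; the function name `law_after_step` is hypothetical. It tracks the numerator w of the value w/v after rescaling and random increase.

```python
from fractions import Fraction


def law_after_step(v):
    """Exact distribution of the numerator w (value w/v) after the step from v to v + 1."""
    law = {w: Fraction(0) for w in range(v)}
    for u in range(v - 1):  # u/(v - 1) is uniform on V minus {1}
        p_u = Fraction(1, v - 1)
        p_up = Fraction(u + 1, v)  # probability of increasing u/v to (u + 1)/v
        law[u] += p_u * (1 - p_up)  # value stays at u/v
        law[u + 1] += p_u * p_up  # value moves up to (u + 1)/v
    return law


if __name__ == "__main__":
    print(law_after_step(4))
```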

(b). By (a), it suffices to consider the following sets, for λ = 0, 1, 2, …, ∞:

\[ V_\lambda := \Bigl\{ \sum_{i=1}^{\lambda} B_i 2^{-i} \Bigm| B \in \{0,1\}^{\mathbb{N}} \Bigr\} \cup \{1\}, \]

i.e., V_0 = {0, 1}, V_1 = {0, 1/2, 1}, V_2 = {0, 1/4, 1/2, 3/4, 1}, …, V_∞ = [0, 1].

We now use the method of deferred decisions. We draw B = (B_1, B_2, …)
randomly regardless of the value of λ. For a random formula F(n, m, 2^λ + 1), the B_1, …, B_λ
have been exposed. Increasing λ to ∞ amounts to exposing all remaining B_i, i =
λ + 1, λ + 2, …, and adjusting the literals of the formula accordingly. But this can only
increase the sum; thus, modifying the literals of a formula in this way can only turn an
unsatisfiable formula into a satisfiable one, and can only increase p_k(n, m, ·).

(c). We use the same approach as in (b). Suppose that a formula F := F_k(n, m, ∞) is
satisfiable. We then find a finite λ such that truncating the sums at the 2^{-λ}-term already
yields a satisfiable formula. First of all, we may assume that the literals (x, ρ, a) all have
distinct values a. Let λ_- be the largest number such that there are literals (x, ρ, a) and
(x', ρ', a') for which the sums in a and a' coincide up to the 2^{-λ_-}-term. Then λ_- is finite.
Letting λ := λ_- + 1, we see that in the truncated random formula, the constraint parts of the
literals have the same relative ordering as in the original formula. Hence, the truncated
formula is satisfiable.

Thus, every satisfiable formula for λ = ∞ becomes satisfiable already at a finite value
for λ. This proves the stated convergence. □

Since the existence of a threshold is, as of now, conjectural, we let, for v ∈ {2, 3, …, ∞},

\[ c_k^-(v) := \sup\{\, c \mid F_k(n, cn, v) \text{ is a.a.s. satisfiable} \,\}, \quad\text{and} \]

\[ c_k^+(v) := \inf\{\, c \mid F_k(n, cn, v) \text{ is a.a.n. satisfiable} \,\}. \]

The existence of a threshold is then equivalent to c_k^-(v) = c_k^+(v); cf. Fig. 1.

Figure 1. Transition from satisfiable to unsatisfiable for increasing c.
Note that both c_k^- and c_k^+ converge to a finite value as v → ∞ by Proposition 4
and Theorem 1.

From Lemma 3, we immediately derive the following. 

Proposition 4. For every k we have the following.

(a) c_k^-(v) and c_k^+(v) are both nondecreasing with the cardinality v of the truth-value set.

(b) c_k^-(v) ≤ c_k^-(∞) and c_k^+(v) ≤ c_k^+(∞).

(c) lim_{v→∞} c_k^-(v) = c_k^-(∞) and lim_{v→∞} c_k^+(v) = c_k^+(∞).

Proof. (a) and (b) follow immediately from their counterparts in Lemma 3. As for (c), let
c^+ := lim_{v→∞} c_k^+(v) and assume that c^+ < c_k^+(∞). Then, for c with c^+ < c < c_k^+(∞), we
have that p(n, cn, ∞) does not converge to 0, so there is a sequence (n_ℓ)_{ℓ=1,2,…} for which
p(n_ℓ, c n_ℓ, ∞) is bounded away from 0. But lim_n p(n, cn, v) = 0, contradicting part (c) of
Lemma 3.

Similarly, let c^- := lim_{v→∞} c_k^-(v) and assume that c^- < c_k^-(∞). Then, for c with
c^- < c < c_k^-(∞), we have lim_n p(n, cn, ∞) = 1. But for all v, we have that p(n, cn, v) is
bounded away from 1. We obtain a contradiction in the same way as above. □

Ancillary. We conclude the section with the following trivial lemma.

Lemma 5. If V = [0, 1], then the following holds.

(a) For every fixed x ∈ [0, 1], the probability that a random constraint defined by (ρ, A) is
satisfied is 1/2.

(b) For two independent random constraints (ρ, A), (ρ', A'), the probability that there is no
x ∈ V satisfying both (i.e., the signs are disjoint) is 1/4.

□
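
Both probabilities in Lemma 5 are easy to confirm by simulation. The following Monte Carlo sketch is illustrative only; the helper names (`random_constraint`, `holds`, `disjoint`, `estimate`) and the choice x = 0.3 are assumptions made for the example. It uses the convention that (≤, a) means x ≤ a and (≥, a) means x ≥ 1 − a, with a uniform on [0, 1).

```python
import random


def random_constraint(rng):
    """Draw a constraint part (rel, a) with rel uniform and a uniform on [0, 1)."""
    return rng.choice(("<=", ">=")), rng.random()


def holds(constraint, x):
    rel, a = constraint
    return x <= a if rel == "<=" else x >= 1.0 - a


def disjoint(c1, c2):
    """True if no x in [0, 1] satisfies both constraints."""
    (r1, a1), (r2, a2) = c1, c2
    if r1 == r2:
        return False  # x = 0 satisfies two "<=" parts, x = 1 satisfies two ">=" parts
    upper, lower = (a1, 1.0 - a2) if r1 == "<=" else (a2, 1.0 - a1)
    return upper < lower  # need x <= upper and x >= lower simultaneously


def estimate(trials=200_000, x=0.3, seed=1):
    """Return empirical frequencies for parts (a) and (b) of Lemma 5."""
    rng = random.Random(seed)
    sat = sum(holds(random_constraint(rng), x) for _ in range(trials))
    dis = sum(disjoint(random_constraint(rng), random_constraint(rng)) for _ in range(trials))
    return sat / trials, dis / trials


if __name__ == "__main__":
    print(estimate())
```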

3. Proof of Theorem 1 

In this section, we prove Theorem 1. By Lemma 3, it suffices to prove the statement for 
the case when the right hand sides of the formulas are drawn uar from [0, 1]. In particular, 
we can assume that no two literals have the same right hand side. 

An interpretation x ↦ x̄ of a formula F is called tight if, for every variable x_j,
one of the two literals "x_j ≤ x̄_j" or "x_j ≥ x̄_j" occurs in F. In other words, only the right
hand sides of the inequalities are allowed as values for the variables. The following fact is
trivial.

Lemma 6. There exists a satisfying interpretation if, and only if, there exists a satisfying 
tight interpretation. □ 
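
Lemma 6 reduces satisfiability to a finite search, which the following brute-force sketch illustrates for small formulas. It is not an algorithm from the paper: the name `satisfiable_by_tight_search` and the triple encoding (j, rel, a), where (j, "<=", a) stands for x_j ≤ a and (j, ">=", a) for x_j ≥ 1 − a, are assumptions of the example, and the running time is exponential in the number of variables.

```python
from fractions import Fraction
from itertools import product


def holds(literal, x):
    j, rel, a = literal
    return x[j] <= a if rel == "<=" else x[j] >= 1 - a


def satisfiable_by_tight_search(formula, n):
    """Try only right hand sides occurring in the formula as values for each variable."""
    candidates = [set() for _ in range(n)]
    for clause in formula:
        for j, rel, a in clause:
            candidates[j].add(a if rel == "<=" else 1 - a)
    for values in candidates:
        if not values:
            values.add(Fraction(0))  # unconstrained variable: any value will do
    for x in product(*candidates):
        if all(any(holds(lit, x) for lit in clause) for clause in formula):
            return True
    return False


if __name__ == "__main__":
    example = [
        [(0, "<=", Fraction(1, 5)), (1, "<=", Fraction(3, 10))],
        [(0, ">=", Fraction(1, 2)), (1, ">=", Fraction(2, 5))],
    ]
    print(satisfiable_by_tight_search(example, 2))
```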

For a random formula F, denote by Y = Y_F the number of satisfying tight interpretations.
To prove Theorem 1, we compute the expectation of Y.

When sampling a random formula, we condition on R = (R_j)_{j=1,…,n} as discussed
above. For j = 1, …, n and ℓ = 1, …, R_j, let A_{j,ℓ} be the (random) right hand side in
the slot containing the ℓth copy of the variable x_j. For every ℓ ∈ ∏_{j=1}^{n} {1, …, R_j}, we
construct an interpretation x ↦ x(ℓ) by letting x_j(ℓ) := A_{j,ℓ(j)}. With the sum below
extending over all ℓ ∈ ∏_{j=1}^{n} {1, …, R_j}, we have

\[ Y = \sum_{\ell} I[\,x(\ell) \text{ satisfies } F\,]. \]

i 

For every fixed ℓ, we can estimate the probability of the event that x(ℓ) satisfies F.
Indeed, n literals will be "automatically" satisfied, namely for every variable x_j the one
containing the ℓ(j)th copy of x_j. Since the right hand sides are drawn independently, for
each of the remaining literals, the probability of being satisfied by x(ℓ) is 1/2, by Lemma 5. The
automatically satisfied literals cover at most n clauses, which leaves at least (c − 1)n clauses,
each of which contains exactly k of the remaining literals. Conditioned on the assignment X of
the variable copies to the slots, the probability that all of these clauses are satisfied is thus
at most

\[ (1 - 2^{-k})^{(c-1)n}. \]

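Moreover, the event that all the remaining clauses are satisfied depends only on the
constraint part of the literals and is thus independent of R.

Now we can compute the expected number of satisfying tight interpretations.

\[
\begin{aligned}
E\,Y &= E\Bigl( E\Bigl( \sum_{\ell} I[\,x(\ell) \text{ satisfies } F\,] \Bigm| R \Bigr) \Bigr)
      = E\Bigl( E\Bigl( \sum_{\ell} E\bigl( I[\,x(\ell) \text{ satisfies } F\,] \bigm| X \bigr) \Bigm| R \Bigr) \Bigr) \\
     &\le (1 - 2^{-k})^{(c-1)n}\, E\Bigl( E\Bigl( \sum_{\ell} 1 \Bigm| R \Bigr) \Bigr)
      = (1 - 2^{-k})^{(c-1)n}\, E\Bigl( \prod_{j=1}^{n} R_j \Bigr) \\
     &\le (1 - 2^{-k})^{(c-1)n}\, (km/n)^{n}
      = \bigl( k c\, (1 - 2^{-k})^{c-1} \bigr)^{n}.
\end{aligned}
\]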

From this, Theorem 1 follows by Markov's inequality. 

4. Proof of Theorem 2(a) 

In this section, we prove Theorem 2(a). For this, we use the Aspvall-Plass-Tarjan-style [2]
characterization of non-satisfiable signed 2-SAT formulas by Chepoi et al. [7] together with
Chvatal and Reed's [8] trick of counting "bicycles".

An ℓ-bicycle contained in a 2-rSAT formula F is a sequence of 2ℓ literals

\[ w_1, w'_1, \dots, w_\ell, w'_\ell, \]

with two end literals w'_0 and w_{ℓ+1}, together with two (not necessarily distinct) numbers
i_0 ∈ {2, …, ℓ} and i_1 ∈ {1, …, ℓ − 1} such that

(bc1) the variables in w_1, w_2, …, w_ℓ are all distinct;

(bc2) the variables in the two literals w_i, w'_i are the same, for i = 1, …, ℓ;

(bc3) the variable of w'_0 is the same as the one of w_{i_0}, and the variable of w_{ℓ+1} is the same
as the one of w_{i_1};

(bc4) for each i = 0, …, ℓ, "w'_i ∨ w_{i+1}" is a clause in F;

(bc5) for each i = 1, …, ℓ, the constraint parts of w_i, w'_i are disjoint.

The following is an immediate adaptation of Chvatal and Reed's proof in [8] to Chepoi et
al.'s [7] variant, for the signed case, of Aspvall et al.'s [2] characterization of non-satisfiable
2-SAT formulas.

Lemma 7. Every unsatisfiable 2-rSAT formula contains an ℓ-bicycle, for some ℓ ≥ 2. □

As in the previous section, we will make use of the fact that, for |V| = ∞, we may
assume that no two literals have the same right hand side.

As above, let R = (R_j)_{j=1,…,n} count the occurrences of the variables in the random
formula F = F'_2(n, m, ∞). Conditioned on R, we can recover the distribution of F as
follows. Let there be n buckets B_1, …, B_n; bucket B_j contains R_j points. Note that there
is an even number 2m of points. A perfect matching of the points corresponds to selecting
two variables for each clause of the formula. Hence, drawing a matching at random and,
independently, drawing a random constraint part for each point, gives us a random formula.
The distribution is the same as that for random formulas, conditioned on R.
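
The bucket construction can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the name `formula_from_occurrences` is hypothetical, V = [0, 1] is approximated by floating-point numbers, and a uniformly random perfect matching is obtained by shuffling the points and pairing consecutive ones.

```python
import random


def formula_from_occurrences(R, rng):
    """Sample a 2-rSAT formula conditioned on the occurrence counts R (which sum to 2m)."""
    # Bucket j contributes R[j] points, each labelled with its variable index j.
    points = [j for j, r in enumerate(R) for _ in range(r)]
    if len(points) % 2:
        raise ValueError("the occurrence counts must sum to an even number 2m")
    rng.shuffle(points)  # pairing consecutive points of a uniform shuffle = uniform matching
    # Attach an independent random constraint part (rel, a) to every point.
    literals = [(j, rng.choice(("<=", ">=")), rng.random()) for j in points]
    return [(literals[i], literals[i + 1]) for i in range(0, len(literals), 2)]


if __name__ == "__main__":
    print(formula_from_occurrences([3, 1, 0, 2], random.Random(0)))
```

Note that two points from the same bucket may be matched, so a clause may contain the same variable twice, as in the model that allows repeated variables.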

We now count the number of ℓ-bicycles in F by counting the corresponding matchings.
We start with the following easy lemma. Here (R)_d := R(R − 1)⋯(R − d + 1) denotes the
falling factorial.

Lemma 8. Let d_1, …, d_n be nonnegative integers. Then

(1) \[ E \prod_{j=1}^{n} (R_j)_{d_j} \le (2c)^{\sum_j d_j}. \]

(2) If all d_j = O(1), then \[ E \prod_{j=1}^{n} (R_j)_{d_j} = (1 + o(1))\,(2c)^{\sum_j d_j}. \]

Proof. This is a direct computation using (3). □ 
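
For small parameters, the bound in Lemma 8(1) can be checked by exhaustive enumeration. The sketch below is an illustration; the names `falling` and `expected_product` are hypothetical. It relies on the standard identity E ∏_j (R_j)_{d_j} = (N)_D / n^D for N slots filled uniformly at random, with D = Σ_j d_j, which in the setting of the lemma has N = 2m and hence is at most (2m/n)^D = (2c)^D.

```python
from fractions import Fraction
from itertools import product


def falling(x, d):
    """Falling factorial (x)_d = x (x - 1) ... (x - d + 1)."""
    out = 1
    for i in range(d):
        out *= x - i
    return out


def expected_product(n, slots, d):
    """Exact E prod_j (R_j)_{d_j} when `slots` slots each pick one of n variables uniformly."""
    total = 0
    for assignment in product(range(n), repeat=slots):  # all n**slots outcomes
        term = 1
        for j in range(n):
            term *= falling(assignment.count(j), d[j])
        total += term
    return Fraction(total, n ** slots)


if __name__ == "__main__":
    n, m = 3, 2  # 2m = 4 slots, c = m/n = 2/3
    value = expected_product(n, 2 * m, (2, 1, 0))
    print(value, value <= Fraction(2 * m, n) ** 3)
```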

We are now ready to complete the proof. 

Proof of Theorem 2(a). Let b = (b_1, …, b_ℓ) ∈ {1, …, n}^ℓ be the choice of the ℓ distinct
variables in condition (bc1) of the definition of a bicycle above. For each one of these b, we
have to choose i_0 ∈ {2, …, ℓ} and i_1 ∈ {1, …, ℓ − 1}.

Letting d_{b_i} := 2, i = 1, …, ℓ, and d_j := 0 for each j not occurring in (b_1, …, b_ℓ), the
number of choices for the matching edges between the buckets corresponding to the clauses
"w'_i ∨ w_{i+1}", i = 1, …, ℓ − 1, conditioned on R, is ∏_{j=1}^{n} (R_j)_{d_j}. For the clauses
"w'_0 ∨ w_1" and "w'_ℓ ∨ w_{ℓ+1}", the number of choices depends on whether i_0 = ℓ, or i_1 = 0,
or both, or neither. For each choice of i_0 and i_1, if we change the d_j to count the number of
times the variable x_j occurs in the bicycle, the number of choices is ∏_{j=1}^{n} (R_j)_{d_j}. There
are at most ℓ² choices for the i_0 and i_1, and we have Σ_j d_j = 2(ℓ + 1), so that, by
Lemma 8, the expectation of the number of choices for the matching edges in the bicycle
for fixed b and i_0, i_1 is at most (2c)^{2(ℓ+1)}, whereas the total number of choices for these
matching edges equals (2m − 1)(2m − 3) ⋯ (2m − 2ℓ − 1).

There are (n)_ℓ possible choices of b, and the probability of disjointness in (bc5) is 1/4 for
each variable (by Lemma 5), or 4^{-ℓ} for the whole bicycle.

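Thus, denoting by Y_ℓ the number of ℓ-bicycles in a random formula and by X_{b,i_0,i_1} the
indicator variable of the event that a bicycle with these parameters exists, we may compute
as follows:

\[
\begin{aligned}
E\,Y_\ell &= E\Bigl( E\Bigl( \sum_{b} \sum_{i_0, i_1} X_{b,i_0,i_1} \Bigm| R \Bigr) \Bigr)
           = E \sum_{b} \sum_{i_0, i_1} E\bigl( X_{b,i_0,i_1} \bigm| R \bigr) \\
          &\le \sum_{b} \frac{4^{-\ell}\, \ell^{2}\, (2c)^{2(\ell+1)}}{(2m-1)(2m-3)\cdots(2m-2\ell-1)} \\
          &= 4^{-\ell}\, \ell^{2}\, (2c)^{2(\ell+1)}\, \frac{(n)_\ell}{(2m-1)(2m-3)\cdots(2m-2\ell-1)}.
\end{aligned}
\]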

For the fraction, we use ad-hoc estimates. Noting that ℓ ≤ n, we have 2m − 2ℓ ≥ 2(c − 1)n.
By the monotonicity property, Lemma 3, it suffices to prove the theorem for c > 1, in which
case 2m − 2ℓ − 1 =: ω(n) → ∞. From Stirling's formula, we see that

\[
\frac{(n)_\ell}{(2m-1)(2m-3)\cdots(2m-2\ell-1)}
 = \frac{1}{\omega(n)\,(2c)^{\ell}} \prod_{j=0}^{\ell-1} \frac{n - j}{n - (j + 1/2)/c}
 = (1 + o(1))\, \frac{1}{\omega(n)\,(2c)^{\ell}}.
\]

Summing over ℓ, we see that

\[ \sum_{\ell=2}^{n} E\,Y_\ell = O(1/n) \]

for c < 2 (the constant in the big-O depends on c). Thus, the expected number of bicycles
of arbitrary length is O_c(1/n). From this, the statement of the theorem follows. □

5. Proof of Theorem 2(b) 

As in the previous section, to prove Theorem 2(b), we adapt the approach of Chvatal and 
Reed [8]: Prove the non-satisfiability by establishing the existence of an obstruction by the 
second moment method. 

For an even integer ℓ ≥ 6, an ℓ-snake consists of a selection of ℓ distinct variables
x_{b_1}, …, x_{b_ℓ}, and clauses "(x_{b_i}, ρ_i, a_i) ∨ (x_{b_{i+1}}, ρ'_{i+1}, a'_{i+1})", i = 0, …, ℓ, with
b_0 := b_{ℓ/2} =: b_{ℓ+1}, such that

(sk1) the constraint parts (ρ'_i, a'_i) and (ρ_i, a_i) are disjoint, for i = 1, …, ℓ;

(sk2) the constraint parts (ρ'_{ℓ+1}, a'_{ℓ+1}) and (ρ_0, a_0) are disjoint;

(sk3) the constraint parts (ρ'_{ℓ/2}, a'_{ℓ/2}) and (ρ'_{ℓ+1}, a'_{ℓ+1}) are disjoint; as well as the constraint
parts (ρ_{ℓ/2}, a_{ℓ/2}) and (ρ_0, a_0) are disjoint.

Lemma 9. If there exists an ℓ-snake, then the formula is not satisfiable.

A snake gives rise to a strongly connected component in the digraph of the formula
defined by Chepoi et al. [7], which contains a literal as well as its negation. Here, we give the
elementary proof of the lemma.

Proof. Assume that a satisfying interpretation x ↦ x̄ exists. We prove that the literal
(x_{b_{ℓ/2}}, ρ'_{ℓ/2}, a'_{ℓ/2}) can be neither satisfied nor violated by x̄. Suppose it were satisfied. Then,
by disjointness of the constraint parts, (x_{b_{ℓ/2}}, ρ_{ℓ/2}, a_{ℓ/2}) must be violated by x̄. Since there
is a clause "(x_{b_{ℓ/2}}, ρ_{ℓ/2}, a_{ℓ/2}) ∨ (x_{b_{ℓ/2+1}}, ρ'_{ℓ/2+1}, a'_{ℓ/2+1})", the latter literal must be satisfied
by x̄. Proceeding in this fashion, it follows that (x_{b_{ℓ+1}}, ρ'_{ℓ+1}, a'_{ℓ+1}) is satisfied, but, since
b_{ℓ+1} = b_{ℓ/2} and by disjointness, this implies that (x_{b_{ℓ/2}}, ρ'_{ℓ/2}, a'_{ℓ/2}) is violated, a
contradiction.

Suppose that (x_{b_{ℓ/2}}, ρ'_{ℓ/2}, a'_{ℓ/2}) is violated by x̄. Since there is a clause
"(x_{b_{ℓ/2−1}}, ρ_{ℓ/2−1}, a_{ℓ/2−1}) ∨ (x_{b_{ℓ/2}}, ρ'_{ℓ/2}, a'_{ℓ/2})", the literal (x_{b_{ℓ/2−1}}, ρ_{ℓ/2−1}, a_{ℓ/2−1})
must be satisfied by x̄. Proceeding as above, we conclude that (x_{b_0}, ρ_0, a_0) is satisfied by x̄,
and hence, since b_0 = b_{ℓ+1}, by disjointness, (x_{b_{ℓ+1}}, ρ'_{ℓ+1}, a'_{ℓ+1}) is violated. Since
"(x_{b_ℓ}, ρ_ℓ, a_ℓ) ∨ (x_{b_{ℓ+1}}, ρ'_{ℓ+1}, a'_{ℓ+1})" is a clause, (x_{b_ℓ}, ρ_ℓ, a_ℓ) is satisfied, and hence,
eventually, so is (x_{b_{ℓ/2}}, ρ_{ℓ/2}, a_{ℓ/2}). By disjointness, then, (x_{b_0}, ρ_0, a_0) is violated, a
contradiction. □

Fix a c > 2, let

\[ \ell := 2\, \frac{\log n}{\log(c/2)}, \]

and denote by X the number of ℓ-snakes. We will compute the expectation of X and of
its second factorial moment X(X − 1), and find that E X = Ω(n) and E X(X − 1) =
O((E X)²). From this, by Chebyshev's inequality, we conclude that the probability that
no ℓ-snake exists is O(1/E X) = O(1/n). Thus, a random formula F_2(n, cn, ∞) is a.a.n.
satisfiable.

The following two lemmas contain the computations of the moments.

Lemma 10. E X = Ω(n).
Proof. As in the previous section, we have

\[
E[\,X \mid R\,] = \sum_{b} \frac{(R_{b_{\ell/2}} - 2)(R_{b_{\ell/2}} - 3) \prod_{i=1}^{\ell} (R_{b_i})_2}{4^{\ell+3}\, (2m-1) \cdots (2m-2\ell-1)},
\]

where the sum extends over all possible choices b ∈ {1, …, n}^ℓ identifying the ℓ distinct
variables in a snake. Thus, by Lemma 8, we have

\[ E\,X = \Theta\bigl( (c/2)^{\ell} / n \bigr) = \Theta(n). \]

□

Lemma 11. E X(X − 1) = O((E X)²).

Proof. We have to compute the expectation of the sum

\[ X(X - 1) = \sum_{S} \sum_{S' \ne S} X_S X_{S'}, \]

where the sums range over all possible snakes S and S', respectively, and X_S denotes the
indicator variable of the event that the snake S exists. Taking the expectation, it can be seen
that the only non-negligible terms in this sum are those for which S and S' are supported
on disjoint sets of variables (see e.g. the computation in §9.2 of [13]).

Whenever S and S' are supported on disjoint sets of variables, a simple calculation
shows that the expectation is O((E X)²). □



This completes the proof of Theorem 2(b). 






6. Conclusion and Outlook 

What makes random k-rSAT intriguing is the presence of a second parameter next to
c = m/n: the cardinality v of the truth-value set V. Since the probability of satisfiability
p_k(n, m, v) increases with v (Lemma 3), c_k^±(v) increases with v, too. Based on
computational experiments, Bejar et al. [6] predicted that c_3(v) increases logarithmically with v.
This is clearly not the case, by Theorem 1. However, based on Bejar et al.'s data, we
conjecture that the functions c_k^±(v) are strictly concave.

Conjecture. For all k ≥ 2 and v ≥ 2, we have c_k^±(v + 1) − c_k^±(v) > c_k^±(v + 2) − c_k^±(v + 1).

Note that p_k(n, cn, v) is not in general concave in v.

Bejar et al. [6] conjecture (for k = 3) that c_k^- = c_k^+, in other words, there is a threshold
behaviour. This can be rephrased in terms of the dependence on the parameter v:

Question. Does there exist a v*_k(c) such that p_k(n, cn, v) = o(1) if v < v*_k(c), and
p_k(n, cn, v) = 1 − o(1) if v > v*_k(c)?

For example, in the case of 2-rSAT, for any fixed c with 1 < c < 2, we know that
p_2(n, cn, 2) = o(1) [8, 9, 11], and p_2(n, cn, ∞) = 1 − o(1) by Theorem 2, but it is not clear
whether the transition happens gradually, or suddenly, at some value v* between 2 and ∞.
If v* exists, though, then it must be "finite" in the sense that it does not depend on n,
cf. Fig. 1.

In the case of 2-rSAT, for v = 2 there is a threshold at c_2^-(2) = c_2^+(2) = 1, and for
v = ∞, there is a threshold at c_2^-(∞) = c_2^+(∞) = 2. It seems likely that this is true for the
remaining values of v, too. In fact, we conjecture the following behaviour.

Conjecture. For all λ = 0, 1, 2, …, ∞, we have

\[ c_2^-(2^{\lambda} + 1) = c_2^+(2^{\lambda} + 1) = 2 - 2^{-\lambda} = \sum_{j=0}^{\lambda} 2^{-j}. \]